Abstract. We investigate the effect of peculiar velocities on the redshift-space distribution of $z \gtrsim 2$ galaxies, focusing in
particular on Lyα emitters. We generate catalogues of dark matter (DM) halos and identify emitters with halos of the same
co-moving space density ($M$(Lyα emitters) $\sim 3 \times 10^{11}\,M_\odot$). We decompose the peculiar velocity field of halos into streaming,
gradient and random components, and compute and analyse these as a function of scale. Streaming velocities are determined
by fluctuations on very large scales and are strongly affected by sample variance, but have a modest impact on the interpretation of
observations. Gradient velocities are the most important, as they distort structures in redshift space, changing the thickness and
orientation of sheets and filaments. Random velocities are typically below or of the same order as the typical observational
uncertainty on the redshift. We discuss the importance of these effects for the interpretation of data on the large-scale structure
as traced by Lyα emitters (or similar kinds of astrophysical high-redshift objects), focusing on the induced errors in the viewing
angles of filaments. We compare our predictions of velocity patterns for Lyα emitters to observations and find that redshift
clumping of Lyα emitters, as reported for instance in the fields of high-redshift radio galaxies, does not allow one to infer whether
an observed field is sampling an early galaxy overdensity.
1. Introduction

It has become feasible to obtain accurate redshifts for large
samples of distant objects and produce 3-dimensional maps
of the universe out to redshifts 3 and beyond. This has allowed
quantitative studies of the large-scale structure of the
distant Universe using Lyman-Break Galaxies (LBGs, see,
e.g., Adelberger et al. 1998; Miley et al. 2004), Lyα emitters
(Warren & Møller 1996; Steidel et al. 2000; Møller & Fynbo
2001; Fynbo, Møller & Thomsen 2001; Shimasaku et al. 2003),
extremely red objects (Daddi et al. 2003), or radio galaxies
(Pentericci et al. 2000; Venemans et al. 2002) as tracers. These
surveys are consistent with the galaxies tracing the characteristic
filamentary pattern, aptly called the 'cosmic web', in the
dark matter, a generic feature of structure formation in a cold
dark matter dominated universe.

In such 3D maps the third dimension is given by redshift
and therefore they are deformed by the peculiar velocities of
the galaxies. For example, infall onto clusters introduces a
characteristic 'finger of God' pattern in the CfA redshift survey
(Huchra et al. 1983). The peculiar velocities can be thought
of as the sum of three components: (i) a streaming flow (driven
by fluctuations on very large scales), (ii) a coherent component
that distorts the large-scale structure and (iii) a small-scale
noise term due to highly non-linear motions. Peculiar
motions may change the apparent inclination angle at which
filaments are observed, so if one wishes to apply an
Alcock-Paczyński test on the distribution of inclination angles of
filaments (Alcock & Paczyński 1979; Møller & Fynbo 2001), then
it is imperative to understand at which scales such an analysis
will not be systematically biased (Weidinger et al. 2002).
Peculiar motions may also lead to an apparent enhancement in
the clumpiness of the redshift distribution of Lyα emitters in
deep narrow-band observations, as we will show below.

In this paper we follow Haehnelt, Natarajan & Rees (1998)
and identify Lyα emitters with halos in our DM simulations
by requiring that the halos of the corresponding mass have the
same number density as the Lyα emitters. Such a simple biasing
scheme should work well if the duty cycle of Lyα emission
is close to one. We then investigate to what extent peculiar
velocities influence the observed distribution of Lyα emitters.
Although we focus on the large-scale structure as traced by Lyα
emitters, our conclusions can be applied to other classes of
objects as well.

This paper is organised as follows. In Sect. 2 we decom- 
pose the velocity field of DM halos into streaming, gradient 
and random components, and show how to estimate such ve- 
locity components on DM halo catalogues generated with the 
PINOCCHIO code (Monaco et al. 2002). In Sect. 3 we quantify 
the velocity components and give analytic fits to reproduce the 
results. The observational consequences of these results on the
reconstruction of viewing angles of filaments and on the redshift
distribution of Lyα emitters in narrow-band imaging selected
volumes at high redshift are given in Section 4. Section 5
concludes. More details on the extension of the PINOCCHIO 
code to multi-scale runs and on the connection between DM 
halos and observed astrophysical objects are given in three ap- 
pendices. 

In this paper we assume a scale-invariant, vacuum energy
dominated flat universe with parameters ($\Omega_m$, $\Omega_\Lambda$, $n$, $h$, $\sigma_8$) =
(0.3, 0.7, 1, 0.7, 0.9) (e.g. Spergel et al. 2003), where the symbols
have their usual meaning.

2. Quantification of peculiar velocity 
components 

Consider a set of DM halos in a given volume; the (highly correlated)
peculiar velocity field traced by these halos can be decomposed
into three components that have different effects on
observations in redshift space (e.g. Weidinger et al. 2002):
(i) the mean velocity of the set, or streaming velocity, (ii) a
gradient component of velocities along the volume and (iii) the
residual (random) component. Performing a Taylor expansion
of the peculiar velocity around the set's mean velocity, these
components are:

$$ \mathbf{v}(\mathbf{x}) = a\,\frac{d\mathbf{x}}{dt} = \mathbf{v}(\mathbf{x}_0) + \left[(\mathbf{x}-\mathbf{x}_0)\cdot\nabla_{\mathbf{x}}\right]\mathbf{v}(\mathbf{x}_0) + \mathbf{v}_r(\mathbf{x}). \qquad (1) $$



Here, $a$ is the scale factor, $\mathbf{x}$ is the co-moving position
of the halo, and the peculiar velocity $\mathbf{v}$ at position $\mathbf{x}$ is written
as a sum of a streaming, gradient and random component.
The geometrical meaning of the three terms is straightforward:
the streaming velocity gives the bulk motion of the whole set;
the gradient component is responsible for distortions of the
redshift-space patterns and can change the shape of the structure
traced by the set; random velocities are considered here as
noise. Note that the gradient component is a tensor, but we
will restrict ourselves to the diagonal components:

$$ h_i = \frac{\partial v_i}{\partial x_i}. \qquad (2) $$



These velocity components are of course only defined once 
the volume containing the set has been identified, i.e. once a 
(co-moving) scale has been chosen. In other words, these quan- 
tities are scale-dependent. 



2.1. Velocity components in linear theory

It is useful to consider the predictions of linear theory for the
three velocities defined above (equation 1) as a function of
scale, both to have an idea of their behaviour and as a basis for
constructing analytic fits of the simulation results in the next
section (see also Hamana et al. 2003).

In the Zel'dovich (1970) approximation, the peculiar velocity
$\mathbf{v}$ of a halo is related to the gravitational potential $\phi$ by (e.g.
Peebles 1980)

$$ \mathbf{v} = -a(t)\,\dot{D}(t)\,\nabla\phi, \qquad (3) $$



where a dot denotes a time derivative. In turn, $\phi$ is related to
the underlying density field $\delta(\mathbf{x}, t) = \rho(\mathbf{x}, t)/\langle\rho(\mathbf{x}, t)\rangle - 1$ through
Poisson's equation,

$$ \nabla^2\phi(\mathbf{x},t) = \delta(\mathbf{x},t). \qquad (4) $$

In these equations $D(t)$ is the growing mode of the density
perturbation. If $P(k)$ denotes the power spectrum of density
fluctuations, then Poisson's equation combined with equation 3
shows that the spectrum of velocity perturbations is
$P_v \propto k^{-2} P(k)$. Therefore the (1D) variance of the peculiar
velocity in linear theory is

$$ \sigma_v^2 = \frac{1}{3}\,(a\dot{D})^2\,\frac{1}{2\pi^2}\int_0^\infty P(k)\,dk. \qquad (5) $$



Note that $\sigma_v^2$ converges readily on small scales where $P(k) \propto k^n$
with $n \sim -2$ to $-3$, but converges only slowly on large
scales, as is well known.

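We now focus on a system of size $R$ to decompose $\sigma_v^2$ in
terms of a streaming, gradient, and random component. The
variance of the streaming velocity of the matter in the volume
$V$ can be computed by smoothing $\phi$ with a window function
$W_R(q)$ (with $V \sim R^3$). If $\tilde{W}$ denotes the Fourier transform of
$W$, then

$$ \sigma^2_{\rm stream}(R) = \frac{1}{3}\,(a\dot{D})^2\,\sigma_0^2(R), \qquad (6) $$

$$ \sigma^2_{\rm rand}(R) = \sigma_v^2 - \sigma^2_{\rm stream}(R), \qquad (7) $$

where

$$ \sigma_l^2(R) = \frac{1}{2\pi^2}\int_0^\infty k^l\,P(k)\,\tilde{W}^2(kR)\,dk. \qquad (8) $$

Similarly, the rms gradient is given by

$$ \sigma^2_{\rm grad}(R) = \frac{1}{3}\,(a\dot{D})^2\,\sigma_2^2(R). \qquad (9) $$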



Note that $\sigma_2^2(R)$ in this equation is also the mass variance on
scale $R$. In the following we will use a top-hat filter, so that the
mass variance will be related to spherical volumes of radius $R$.
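
As a rough numerical illustration of these linear-theory expressions, the following Python sketch evaluates top-hat-filtered spectral moments and the three variances. It assumes the moments are defined as $\sigma_l^2(R) = (2\pi^2)^{-1}\int k^l P(k)\tilde{W}^2(kR)\,dk$ and that streaming and gradient variances share the prefactor $(a\dot{D})^2/3$, with the random variance taken as the remainder of the total. The power spectrum is a toy shape chosen only for illustration (it is not the CDM spectrum used for the figures), and all function names are hypothetical.

```python
import numpy as np

def tophat_window(x):
    """Fourier transform of a spherical top-hat: 3 (sin x - x cos x) / x^3."""
    x = np.asarray(x, dtype=float)
    small = x < 1e-3
    xs = np.where(small, 1.0, x)              # placeholder value avoids 0/0
    w = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, w)  # series expansion near 0

def trapezoid(y, x):
    """Plain trapezoidal rule on a (possibly non-uniform) grid."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def toy_power_spectrum(k, k0=0.02):
    """Toy P(k): rises as k on large scales, falls as k^-3 on small scales."""
    return k / (1.0 + (k / k0) ** 2) ** 2

def spectral_moment(l, R, k, pk):
    """sigma_l^2(R) under the definition assumed in the lead-in."""
    return trapezoid(k**l * pk * tophat_window(k * R) ** 2, k) / (2.0 * np.pi**2)

def velocity_variances(R, k, pk, a_Ddot=1.0):
    """Return (stream, grad, rand) variances; a_Ddot stands for a * dD/dt."""
    pref = a_Ddot**2 / 3.0
    total = pref * trapezoid(pk, k) / (2.0 * np.pi**2)   # unfiltered 1D variance
    stream = pref * spectral_moment(0, R, k, pk)
    grad = pref * spectral_moment(2, R, k, pk)
    return stream, grad, total - stream

k = np.logspace(-4, 2, 4000)
pk = toy_power_spectrum(k)
for R in (5.0, 50.0):
    print(R, velocity_variances(R, k, pk))
```

With this toy spectrum the streaming variance falls and the random variance grows as $R$ increases, which is the qualitative behaviour described in the text; the absolute numbers carry no physical meaning.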

In Fig. 1 we plot the streaming, gradient and random components
predicted by linear theory at $z = 2$ as a function of
scale $R$. Streaming velocities dominate on scales smaller than
$\sim 30$ Mpc; on larger scales random velocities dominate. The
cross-over scale is a measure of the velocity correlation in linear
theory, and is similar to the size of the largest observable
structure (filament or void) at the given redshift. The rms gradient
$\sigma_{\rm grad}(R)$ is usually a small fraction of the Hubble constant
$H(z)$.

In the next section we compute these variances using sim- 
ulations. 

2.2. Simulations 

As we saw in the previous section, peculiar velocity fields 
are correlated over large scales, hence simulations need to be 
performed in a large volume to adequately sample the large- 
scale modes. Combined with the need to be able to resolve 
small halos, we require simulations with a large dynamic range. 
Furthermore, to test convergence of results and decrease sam- 
ple variance we need to run many simulations. The PINOCCHIO 
algorithm (Monaco et al. 2002a; Monaco, Theuns & Taffoni 
2002b; Taffoni, Monaco & Theuns 2002) is ideally suited for 
this purpose. 

PINOCCHIO uses Lagrangian perturbation theory and an algorithm
to mimic the hierarchical build-up of DM halos to predict
the masses, positions and velocities of dark matter halos
as a function of time. The agreement between PINOCCHIO and
a full-scale N-body simulation is very good, even when comparing
the properties of individual halos. PINOCCHIO does not
compute the density profile of the halos, and as a consequence
is many orders of magnitude faster than an N-body simulation.
A simulation with $256^3$ particles requires a few hours on a PC.

As we will show below, a typical Lyα emitter lies in a halo
of mass $\sim 3 \times 10^{11}\,M_\odot$, so if we want to resolve a halo with
150 particles, then the particle mass is $2 \times 10^{9}\,M_\odot$. In a simulation
with $256^3$ particles, and given our assumed cosmology,
this limits our box size to $\sim 65\,h^{-1}$ Mpc, too small to properly
sample the large-scale velocity field. Fortunately it is not necessary
to perform much larger, computationally expensive simulations,
because it is straightforward to add long-wavelength
perturbations to PINOCCHIO. This is explained in Appendix A,
while Appendix B quantifies the accuracy of PINOCCHIO in reproducing
the velocity components defined above.

Table 1. PINOCCHIO runs performed; the particle mass is $6.7 \times 10^{8}$.
$L_1$ and $\Delta_1$ ($L_2$ and $\Delta_2$) are the size and grid spacing of the
low (high) resolution box, respectively. Runs of a given
type differ only in the random seed.

run id   # of runs   $L_1$ ($h^{-1}$ Mpc)   $L_1/\Delta_1$   $L_2$ ($h^{-1}$ Mpc)   $L_2/\Delta_2$
P1       10          -                      -                65                     256
P2       11          520                    64               65                     256

To address convergence of peculiar velocities and sample
variance, we have run many realisations (Table 1). For the reference
cosmology, we have performed 10 standard PINOCCHIO
runs with a single grid of size $L = 65\,h^{-1}$ Mpc and grid spacing
$\Delta$ such that $L/\Delta = 256$ ($\Delta = 0.254\,h^{-1}$ Mpc), and 11
PINOCCHIO runs with two grids, using a high-resolution grid
with $L_2 = 65\,h^{-1}$ Mpc, $L_2/\Delta_2 = 256$, and a low-resolution grid with
$L_1 = 8L_2 = 520\,h^{-1}$ Mpc, $L_1/\Delta_1 = 64$ ($\Delta_1 = 8.125\,h^{-1}$
Mpc).
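
The grid spacings and the particle mass quoted in this section can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the standard value of the present-day critical density, $2.775 \times 10^{11}\,h^2\,M_\odot$ Mpc$^{-3}$, together with the cosmology adopted in this paper, and the function name is hypothetical.

```python
RHO_CRIT_H2 = 2.775e11  # critical density today, in h^2 M_sun / Mpc^3 (standard value)

def particle_mass(box_hinv_mpc, n_per_side, omega_m=0.3, h=0.7):
    """Mass per particle in M_sun for a cubic box with n_per_side^3 particles."""
    spacing = box_hinv_mpc / n_per_side              # grid spacing in h^-1 Mpc
    mass_hinv = omega_m * RHO_CRIT_H2 * spacing**3   # in h^-1 M_sun
    return mass_hinv / h

spacing_high = 65.0 / 256    # high-resolution grid spacing, h^-1 Mpc
spacing_low = 520.0 / 64     # low-resolution grid spacing, h^-1 Mpc
m_p = particle_mass(65.0, 256)

print(f"high-res spacing: {spacing_high:.3f} h^-1 Mpc")
print(f"low-res spacing:  {spacing_low:.3f} h^-1 Mpc")
print(f"particle mass:    {m_p:.2e} M_sun")
print(f"150 particles:    {150 * m_p:.2e} M_sun")
```

Under these assumptions a 150-particle halo in the $65\,h^{-1}$ Mpc box weighs close to $3 \times 10^{11}\,M_\odot$, in line with the resolution argument given in the text.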

2.3. Computing velocity statistics from halo 
catalogues 

Fig. 2. Velocity statistics for halos with mass $M > 10^{11}\,M_\odot$ at $z = 2$ in the PINOCCHIO simulations of Table 1. Thick lines and
thin lines denote the mean and 1-σ dispersion of these statistics respectively, for runs P1 (dashed) and P2 (full lines). Note how
the P1 simulation severely underestimates the streaming velocity because of the missing large-scale power in the small simulation
box.

To compute streaming, gradient and random velocities from
the PINOCCHIO runs, we have subdivided the simulated boxes
into $n^3$ cubic sub-boxes of side $l_{\rm sub} = l_{\rm box}/n$, where $n$ is a
running integer. For each subdivision $n$, each sub-box $(j, n)$
(where $j = [0, n^3]$) is centred on $\mathbf{x}_{0,j,n}$ and contains $N_{j,n}$ dark
matter halos more massive than a given threshold mass. For
each sub-box $(j, n)$ and for each spatial component $i$ we compute
the streaming and gradient velocities as the zero point and
slope of a linear regression with respect to position $x_i$ of the
velocities $v_i$ of all the halos:



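$$ v_i(\mathbf{x}_{0,j,n}) = \frac{\sum x_i^2 \sum v_i - \sum x_i \sum x_i v_i}{N_{j,n}\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad (10) $$

$$ \frac{\partial v_i}{\partial x_i}(\mathbf{x}_{0,j,n}) = \frac{N_{j,n}\sum x_i v_i - \sum x_i \sum v_i}{N_{j,n}\sum x_i^2 - \left(\sum x_i\right)^2}. \qquad (11) $$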



The sum is over the $N_{j,n}$ halos in the sub-box $(j, n)$. Random
velocities are computed as the residuals of velocities with respect
to the linear regression. For each sub-box $(j, n)$ and for
each component $i$ we compute the variance of these residuals:

$$ \sigma_{r,i}^2(\mathbf{x}_{0,j,n}) = {\rm var}\left[ v_i(\mathbf{x}) - v_i(\mathbf{x}_{0,j,n}) - (\mathbf{x}-\mathbf{x}_{0,j,n})_i\,\frac{\partial v_i}{\partial x_i}(\mathbf{x}_{0,j,n}) \right]. \qquad (12) $$



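Finally, for each sub-box size $n$ we compute the variance
of these quantities over all sub-boxes $j$ that contain at least 5
objects ($N_{j,n} \geq 5$), and express the result as a function of the
sub-box length $l_{\rm sub}$:

$$ \sigma^2_{{\rm stream},i}(l_{\rm sub}) = \left\langle v_i^2(\mathbf{x}_{0,j,n}) \right\rangle, \quad
\sigma^2_{{\rm grad},i}(l_{\rm sub}) = \left\langle \left[\frac{\partial v_i}{\partial x_i}(\mathbf{x}_{0,j,n})\right]^2 \right\rangle, \quad
\sigma^2_{{\rm rand},i}(l_{\rm sub}) = \left\langle \sigma_{r,i}^2(\mathbf{x}_{0,j,n}) \right\rangle. \qquad (13) $$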



Notice that the second quantity, $\sigma_{{\rm grad},i}$, has the same dimension
as the Hubble constant, while the other two are pure
velocities. In the following, where not otherwise stated, we will
show for the three velocity statistics the average over the three
directions.
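
The procedure of this section can be sketched in Python as follows. This is a minimal illustration rather than the code used for the figures: it treats one spatial component at a time, assumes that positions are measured relative to the sub-box centre (so that the regression intercept is the velocity at the centre), and uses hypothetical function names.

```python
import numpy as np

def decompose_subbox(x, v):
    """Least-squares split of one velocity component inside one sub-box.

    x : halo positions along one axis, relative to the sub-box centre
    v : halo peculiar velocities along the same axis
    Returns (streaming velocity, velocity gradient, variance of residuals).
    """
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    n = x.size
    sx, sv = x.sum(), v.sum()
    sxx, sxv = (x * x).sum(), (x * v).sum()
    denom = n * sxx - sx**2
    stream = (sxx * sv - sx * sxv) / denom   # zero point of the regression
    grad = (n * sxv - sx * sv) / denom       # slope of the regression
    residuals = v - stream - grad * x        # random component
    return stream, grad, residuals.var()

def velocity_statistics(subboxes, min_objects=5):
    """Mean-square statistics over all sub-boxes with enough halos.

    subboxes : iterable of (x, v) pairs, one per sub-box
    Assumes at least one sub-box passes the min_objects cut.
    """
    results = [decompose_subbox(x, v) for x, v in subboxes if len(x) >= min_objects]
    stream, grad, var_r = np.array(results).T
    return np.mean(stream**2), np.mean(grad**2), np.mean(var_r)

# Synthetic check: a pure bulk flow plus a linear gradient leaves no residual.
rng = np.random.default_rng(1)
x_demo = rng.uniform(-10.0, 10.0, 50)
print(decompose_subbox(x_demo, 120.0 + 8.0 * x_demo))
```

In a real application the pairs passed to `velocity_statistics` would be built by binning a halo catalogue into the $n^3$ sub-boxes, and the calculation would be repeated for each of the three directions before averaging.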



3. Results



Fig. 2 shows the streaming, gradient and random velocity variances
for halos more massive than $10^{11}\,M_\odot$ at $z = 2$, as a function
of scale, averaged over the P1 simulations (dashed lines). As
for linear theory, on small scales streaming flows dominate and
random velocities are relatively small, while the reverse is true
at larger scales. However, the relatively modest value of the
streaming velocity is mostly an artifact connected to the small
box size and the resulting poor sampling of the large-scale
modes; in a low-density universe velocity fields are highly correlated,
and to achieve convergence for the streaming velocity
it is necessary to average over volumes as large as several hundred
Mpc. Fig. 1 shows that the expected streaming flow velocity
drops below 50 km s$^{-1}$ only at scales of $\gtrsim 200$ Mpc. The
P2 simulations do not suffer from lack of large-scale power,
and the streaming velocities are much larger (Fig. 2). Note that
the random and gradient dispersions are similar in the smaller
boxes.

Due to the large correlation length of the velocity field,
the three velocity statistics (especially the streaming motions)
show a great deal of variation among different realisations,
which we call sample variance. Fig. 3 shows the random and
streaming velocities for each of the P2 runs separately. Notice
that the P1 realisations artificially suppress some of the sample
variance by imposing periodic boundary conditions, while
the P2 realisations do not have this drawback. Here we show
in each panel the streaming, gradient and random velocities as
computed in the three directions. The curves fluctuate considerably
from one realisation to the other. The 12th panel shows the average
with 1σ fluctuations, the same quantity shown in Fig. 2.

Fig. 3. Sample variance of velocity statistics for the 11 runs
of the P2 simulation. Continuous, dotted and dashed lines denote
streaming, gradient and random velocity variances, respectively.
Here we show the three spatial components separately.
The 12th panel contains the average variance for all runs,
with the corresponding 1-σ dispersion.

This figure shows clearly the importance of a proper quan- 
tification of sample variance. This is important not only to test 
the reliability of the predictions, but also to quantify the inter- 
val in which observed data are expected. 

The velocity statistics shown above depend on DM halo
mass, redshift and cosmology. Figs. 4 and 5 illustrate the mass
and redshift dependence. In the same figures we show analytic
fits, based on linear theory, to the velocity statistics. In particular,
streaming velocities are reasonably well fit on the scales of
interest by the simple linear theory prediction of Sect. 2.1.
A weak mass dependence is noticeable; however, it is much
stronger for gradient and random velocities. Such a mass dependence
is expected since halos are biased tracers of the mass
(see also Hamana et al. 2003).

Because of the self-similar character of gravity, we expect
to be able to fit the mass-dependence by a simple function of
the spectral moments $\sigma_l$ (equation 8). The mass of the halo depends
on $\sigma_2(R)$, while the velocity variances depend on $\sigma_0(R)$
(equations 6 and 7). The (top-hat) co-moving smoothing radius
$R$ is connected to the halo mass $M$ through the relation
$4\pi R^3 \rho_0 = M$, where $\rho_0$ is the actual average matter density.
With this $M - R$ relation the mass variance relative to $M$,
$D(z)\sigma(M)$, is then computed$^1$. The mass dependence of random
and gradient velocities is then reasonably well reproduced
(Figs. 4 and 5) by



v iandM (R) = j=(aD)ot>(R) . t- y D{z)(T2{R) 



0.5 



(14) 



(15) 



where $4\pi R^3 \rho_0 = M$. They give acceptable fits at scales larger
than 10 co-moving Mpc, although some residual mass dependence
is present; in particular, more massive objects are not
perfectly reproduced.

We have verified that the dependence on cosmological pa- 
rameters is correctly reproduced by these fits by performing ad- 
ditional PINOCCHIO simulations, using the same random seeds 
to be less affected by sample variance. 

4. Observational consequences 

In the previous sections we characterised the effect of peculiar
velocities on the distribution of halos in redshift space. To apply
this result to galaxies we need to know how to associate
galaxies with dark matter halos. In this section we apply a very
simple biasing scheme where we associate galaxies of a given
type with halos with the same co-moving space density. More
complex schemes have appeared in the literature (e.g. based on
the halo occupation distribution, Berlind et al. 2003) but our
model has the advantage of simplicity and it is sufficient for
our purpose$^2$.

Lyα emitters are the most numerous emission-selected objects
known at high redshifts, suggesting that they must inhabit
relatively low-mass halos. In a deep search in two fields
at $z = 2.85$ and $z = 3.15$, Fynbo et al. (2003) determined the
co-moving space density $n_{\rm Ly\alpha}$ of spectroscopically confirmed
Lyα emitters down to their Lyα flux detection limit of $7 \times 10^{-18}$
erg s$^{-1}$ cm$^{-2}$ to be $\log(n_{\rm Ly\alpha}) = -2.6$. Assuming that 100% of
DM halos host a Lyα emitter, the measured space density in our
cosmology is typical of halos of mass $6 \times 10^{11}\,M_\odot$. The duty
cycle could be lower than 100%; for LBGs, only 25% show
significant Lyα emission (e.g., Shapley et al. 2003). However,
this is likely to be a lower limit to the duty cycle of typical Lyα
emitters, which have smaller star formation rates and are therefore less
affected by dust obscuration. If a 25% duty cycle is adopted,
the corresponding halo mass decreases to $2 \times 10^{11}\,M_\odot$. These
numbers should bracket the solution, and justify the choice of
$\sim 3 \times 10^{11}\,M_\odot$ anticipated in Section 2.2.



$^1$ Notice that the mass variance $\sigma(M)$ is, by definition, linearly
extrapolated to $z = 0$; to obtain its value at $z$ it is multiplied by the
growing mode $D(z)$.

$^2$ Notice that in this scheme we only assume that a galaxy of a given
(stellar) mass is typically associated with a DMH of a given mass. We do
not assume a direct proportionality between galaxy and DMH mass,
which we know is not present in real galaxies (see, e.g., Persic,
Salucci & Stel 1996).

Fig. 4. Velocity statistics for different halo masses at $z = 2$. The average and variance for the simulation P2 are shown. Dotted
lines give the analytic fits (equations 7, 14 and 15).

4.1. The influence of velocities on Lyα filaments

Several properties combine to make Lyα emitters a good tracer
for mapping large-scale structure. Because they have a higher
space density than any other class of detectable objects at high
redshifts, they provide the best possible sampling of structures
at all scales; their redshift is always measured from the same
emission feature, so redshifts are obtained in a very homogeneous
way; and their low masses make them weakly biased
tracers of the large-scale structure. A natural prediction of
hierarchical clustering is then the likely detection of filaments and
pancakes in the 3D distribution of Lyα emitters. One such filament
traced by Lyα emitters has been detected at $z = 3.04$
(Møller & Fynbo 2001), but the inferred 3D properties of filaments
will be modified by peculiar velocities, and to recover
their true properties it is necessary to understand those effects,
which can be divided into three distinct components.

The streaming velocity of galaxies on the observed scale of
the filament will change the mean redshift by a small amount,
$\sim 150$ km s$^{-1}$ on scales of tens of Mpc, amounting to a negligible
shift in redshift of $5 \times 10^{-4}$. The gradient component
will distort the viewing angle of the filament; in particular the
relative (systematic) error on the line-of-sight dimension of the
filament will be:

$$ \frac{\Delta l}{l} = (1+z)\,\frac{\partial v/\partial x}{H(z)}. \qquad (16) $$



(The gradient is multiplied by $(1 + z)$ because the Hubble constant
is defined in terms of physical, rather than co-moving,
distance.) At that scale the gradient will be about 10 km s$^{-1}$
Mpc$^{-1}$, and the relative error will be 0.13 (for a Hubble constant
of 312 km s$^{-1}$ Mpc$^{-1}$, which is the Hubble constant at
$z = 3$ in the assumed cosmology). This will also be the relative
error of the arc cosine of the viewing angle. The corresponding
systematic error on the inclination angle will hence typically
be about 2-3°, which is similar to the 1.9° error due to sparse
sampling on the inclination angle of the $z = 3.04$ filament
(Weidinger et al. 2002).
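
The numbers in this estimate are easy to reproduce. The sketch below assumes the flat cosmology adopted in this paper and reads the relative line-of-sight error as $(1+z)$ times the co-moving velocity gradient divided by $H(z)$; the function names are hypothetical.

```python
import math

def hubble(z, h=0.7, omega_m=0.3, omega_lambda=0.7):
    """Hubble rate in km/s/Mpc for a flat matter + vacuum-energy universe."""
    return 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def los_relative_error(gradient, z):
    """Relative distortion of a line-of-sight length.

    gradient : velocity gradient in km/s per co-moving Mpc
    The factor (1 + z) converts the co-moving gradient to physical units.
    """
    return (1.0 + z) * gradient / hubble(z)

print(f"H(z=3) = {hubble(3.0):.0f} km/s/Mpc")
print(f"relative error for 10 km/s/Mpc at z=3: {los_relative_error(10.0, 3.0):.2f}")
```

For a gradient of 10 km s$^{-1}$ Mpc$^{-1}$ at $z = 3$ this gives a relative error close to 0.13, as quoted above.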

Random velocities will thicken the filament. For our test
case, velocities just above 100 km s$^{-1}$ are expected, so they
will contribute in a similar way as the typical uncertainty in the
redshift.

These effects should be taken into account when estimating,
for instance, the cosmological parameters by applying the
extended Alcock-Paczyński test on the distribution of viewing
angles (Møller & Fynbo 2001; Weidinger et al. 2002).

4.2. Enhancement of clustering in redshift space 

The power of the approach presented here goes beyond a sta- 
tistical quantification of the effects of the velocity components. 
We illustrate this point by giving an example of interpretation 
of data based on simulated catalogues of Lyα emitters.

Fynbo et al. (2003) detected a significant degree of redshift
clumping in the field around a DLA toward the quasar Q2138-4427
(at $z = 2.85$). This is clearly visible in their Fig. 8,
where redshifts clump into a limited interval, much narrower
than the redshift depth corresponding to the filter. In the other
field of that study (Q1346-0322 at $z = 3.15$), the redshifts
are uniformly distributed over the range defined by the filter.
The clumping can be quantified by $\sigma_z$, the root-mean-square of
the redshift distribution, found to be 0.018 (with 19 emitters)
and 0.006 (with 23 emitters) for the fields of Q1346-0322 and
Q2138-4427 respectively. These $\sigma_z$ values should be compared
to the expected value of 0.019 based on a simple Monte Carlo
simulation using the filter transmission as selection function.
Hence, the Q2138-4427 field clearly shows a significant degree
of structure. Similar redshift clumping has been reported in the
fields of two radio galaxies at redshifts $z = 2.14$ and $z = 4.10$
(Pentericci et al. 2000; Venemans et al. 2002).

It is interesting to ask how often and under what conditions
similar redshift clumping occurs in the simulations.
Peculiar velocities can influence the clumping of redshifts in
different ways. While streaming flows shift the whole redshift
distribution, gradient velocities can increase or decrease the
dispersion $\sigma_z$. If a mildly non-linear structure (a filament or
a pancake) is present in the field, it is known that the peculiar
velocity field (its gradient component, in our terminology)
will tend to flatten it, thus decreasing $\sigma_z$ (see, e.g., Strauss &
Willick 1995). Random velocities will instead tend to increase
$\sigma_z$.

To assess the likelihood of the observed $\sigma_z$ values and the
influence of peculiar velocities we extract 15 mock catalogues
from each of the P2 runs. Each mock catalogue is extracted by
picking random redshift-space volumes with sizes corresponding
to the volume sampled by the observation and selecting all
DM halos more massive than $3 \times 10^{11}\,M_\odot$ contained in the volume.
The connection between minimal Lyα flux and minimal
halo mass is fixed loosely (see Sect. 4.1-4.3), so the number of
emitters here is to be considered as indicative. However, as long
as such small halos trace the same structure nearly independently
of mass, $\sigma_z$ should not be affected by this assumption.
Referring to a filter FWHM of 60 Å and a field of view of 6.7
arcmin, we extract volumes of 12.4 x 12.4 x 47.0 co-moving
Mpc (the line of sight corresponding to the longer dimension).
Boxes are required to contain at least three objects. Redshifts
are computed along the major axis of the extracted volume.
Fig. 6 shows the resulting $\sigma_z$ of the redshift distributions of the
mock catalogues as a function of the number of mock emitters
found in the box, which is a measure of overdensity. The
$\sigma_z$ values are computed both neglecting and considering peculiar
velocities. The lines show the average and ±1-σ intervals
of the $\sigma_z$ distribution. The expected $\sigma_z$ value in the case of no
clumping is 0.0142; due to the well-known clustering of halos,
significantly lower values are expected on average.
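
The no-clumping value of $\sigma_z$ for a top-hat selection can be illustrated with a short Python sketch. It assumes that emitters are uniformly distributed in redshift across a top-hat filter whose width in redshift is the FWHM divided by the Lyα rest wavelength (1215.67 Å), so that the dispersion is the width divided by $\sqrt{12}$; the Monte Carlo part simply draws unclustered samples of a given size. Function names and sample sizes are illustrative choices, not taken from the analysis above.

```python
import numpy as np

LYA_REST = 1215.67  # Lya rest-frame wavelength in Angstrom

def tophat_sigma_z(fwhm_angstrom):
    """rms redshift of a uniform distribution across a top-hat filter."""
    dz = fwhm_angstrom / LYA_REST   # full redshift width of the filter
    return dz / np.sqrt(12.0)       # standard deviation of a uniform variate

def monte_carlo_sigma_z(fwhm_angstrom, n_emitters, n_trials=2000, seed=0):
    """sigma_z of n_trials unclustered mock fields with n_emitters each."""
    rng = np.random.default_rng(seed)
    dz = fwhm_angstrom / LYA_REST
    z_offsets = rng.uniform(-dz / 2, dz / 2, size=(n_trials, n_emitters))
    return z_offsets.std(axis=1)

sigma_uniform = tophat_sigma_z(60.0)
sigma_mc = monte_carlo_sigma_z(60.0, n_emitters=20)
print(f"analytic top-hat sigma_z: {sigma_uniform:.4f}")
print(f"Monte Carlo mean and scatter (20 emitters): {sigma_mc.mean():.4f} +/- {sigma_mc.std():.4f}")
```

For a 60 Å filter the analytic value is about 0.0142. The Monte Carlo scatter gives a feeling for how much $\sigma_z$ fluctuates from field to field even without clustering or peculiar velocities.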

The observational points are reported as well. As the filters
are more similar to Gaussians than to top-hats, the expected $\sigma_z$
value for a uniform distribution (0.019) is higher than in our
case, which assumed a top-hat (0.014); therefore we multiply
the observed values by 0.014/0.019 = 0.737.

As is apparent, peculiar velocities are responsible for decreasing
the value of $\sigma_z$ by some 10% when it is already small;
these are cases of filaments (or pancakes) seen perpendicularly
to the line of sight, where the effect of flattening by peculiar
velocities is largest. The two observed points are well within
the predicted range, so these fields are by no means rare cases.
In particular, the low value of $\sigma_z$ in the Q2138-4427 field, coupled
to the moderately high value of the overdensity inferred,
can be interpreted, as mentioned above, as the effect of a flattened
structure. It is a 1.76σ event, so equally low values of $\sigma_z$
will be expected in 4.5% of all observed fields. If peculiar velocities
are neglected, the Q2138-4427 field turns out to be a
1.93σ event, only marginally rarer.

Pentericci et al. (2000) and Venemans et al. (2002) both use
the observed redshift clumping to argue for substantial overdensities
around radio galaxies, and claim these suggest the
detection of a protocluster. They assume that the overdensity
$\delta$ can be estimated from

$$ \delta = \frac{n_{\rm obs} \times {\rm fwhm}_{\rm filter}}{n_{\rm field} \times {\rm fwhm}_z}, \qquad (17) $$



where $n_{\rm obs}$ is the observed number density in the field containing
the radio galaxy, $n_{\rm field}$ is an estimate of the number density
of a reference field, and fwhm$_{\rm filter}$ and fwhm$_z$ are the
full-widths-at-half-maximum of the filter transmission (transformed to
redshift space) and of the observed redshift distribution, respectively.
However, as seen from Fig. 6, $\sigma_z$ is not a decreasing
function of density. In fact, it is more likely to have a low $\sigma_z$ in
the redshift distribution in a field with few Lyα emitters than in
an overdense field. Therefore, the only valid way of resolving
whether radio galaxies are located in protoclusters is to obtain
an accurate measurement of the number density of galaxies in
blank fields at similar redshifts.

Fig. 6. Redshift dispersion, $\sigma_z$, of Lyα emitters selected in a
narrow-band field as a function of the number of emitters. A big
filled circle and a big star denote the fields around Q1346-0322
and Q2138-4427 (Fynbo et al. 2003), respectively. Filled triangles
denote $\sigma_z$ in 165 mock samples of Lyα emitters; the mean
and 1-σ dispersion are indicated by thick and thin dashed
lines respectively. In the mocks, Lyα emitters are assumed to
reside in halos more massive than $3 \times 10^{11}\,M_\odot$. Full lines and
filled squares neglect peculiar velocities. The horizontal dotted
line denotes the mean dispersion in the absence of peculiar
velocities and clustering. Halo clustering decreases $\sigma_z$ significantly
(dotted line compared to full line) but peculiar velocities
do not have a strong effect (dashed line compared to full line).
The observed points fall well within the range covered by the
mocks.

5. Conclusions 

We have characterised and quantified the effect of peculiar velocities
in the reconstruction of large-scale structure at high
redshift, with particular attention to Lyα emitters as tracers.

With the aid of PINOCCHIO simulations we have decom- 
posed the velocity field of DM halos into a streaming flow, a 
gradient and a random velocity term, and computed them as 
functions of scale. The dependence of these velocity statistics 
on halo mass, redshift and cosmology has been quantified and 
fitting formulae have been proposed. 



The main effects of these velocity components on the observational
properties of Lyα emitters have been analysed. In
particular, streaming flows are determined by fluctuations on
very large scales, and are strongly affected by sample variance,
but have a modest impact on the interpretation of observations.
Gradient flows are the most important, in that they influence the
quantitative reconstruction of structures, such as the inclination
angle of filaments, important for applying the extended
Alcock-Paczyński test (Møller & Fynbo 2001), or the root-mean-square
of the redshift distribution, important to recognise flattened
structures (pancakes or filaments) perpendicular to the line of
sight. Random velocities are typically below or of the same order
as the observational uncertainty on the redshift.

The results presented here have been applied to quantify the
influence of peculiar velocity on the reconstructed viewing angles
of filaments at $z \sim 3$. In particular, the effect of streaming
velocities is negligible, gradient velocities give an error of 2-3°,
similar to but larger than the typical error due to sparse
sampling, while random velocities add to the $\sim 100$ km s$^{-1}$
error on the redshift. Clearly, a proper quantification of such
errors is necessary to apply an Alcock-Paczyński test to
the inclination of filaments.

As a further example of the power of this approach, we have 
generated mock catalogues of Lya emitters to assess the sig- 
nificance of a detected narrow distribution in redshift in a deep 
exposure. The observation is found to be a $\sim 2\sigma$ event corresponding
to a sheet of galaxies seen face on. Peculiar velocities
give a modest but significant contribution to the narrowness of 
the redshift distribution, and this again corresponds to the dom- 
inant effect of gradient velocities with respect to random veloc- 
ities. Moreover, we do not notice a significant anti-correlation 
between the abundance of emitters, a tracer of overdensity, and 
the degree of clumpiness, at variance with what is assumed by
Pentericci et al. (2000) and Venemans et al. (2002). 

The results presented here will be important for interpreting 
the upcoming data on the large-scale structure as traced by Lya 
emitters. Further work will be aimed at generating mock cata- 
logues of Lya emitters that closely reproduce the observational 
selection effects, in order to devise tight observational tests for 
the hierarchical clustering model at $z \gtrsim 2$.

Appendix A: Adding long wavelength modes to PINOCCHIO

Tormen and Bertschinger (1996) describe an algorithm to in- 
crease the dynamic range of a simulation by adding long- 
wavelength perturbations after the simulation has been done. 
However, as pointed out by Cole (1997), the algorithm neglects 
the coupling between long-wavelength linear modes and short- 
wavelength non-linear modes, and this strongly affects the clus- 
tering of halos. Fortunately, this is not a problem in PINOC- 
CHIO, since it is easy to correctly incorporate the effect of long 
wavelength modes on the non-linear collapse of structures. We 
begin by giving a very brief overview of the PINOCCHIO al- 
gorithm, and then proceed to describe how one can easily add 
long wavelength modes. 

The standard PINOCCHIO algorithm operates on a realisation
of a linear density field generated on a regular grid, identical
to the grid used in the initial conditions of an N-body simulation.
In a first step, a 'collapse time' is computed for each grid
point ('particle') using a truncation of Lagrangian perturbation 
theory based on ellipsoidal collapse. The collapse time is the 
time at which the particle is deemed to fall into a high-density 
region (a halo or filament). In the second step, collapsed par- 
ticles are gathered into halos, using an algorithm that mimics 
the hierarchical build-up of halos (see Monaco et al. 2002a for
more details). 

The calculation of the collapse times itself also involves
two steps: (a) a series of linear operations on the initial density
field, followed by (b) a non-linear calculation. For a Gaussian
random field, the long- and short-wavelength perturbations are
by definition independent, so it is trivial to perform the
first step for long and short wavelengths separately. In contrast
to the Tormen & Bertschinger (1996) implementation, the two-step
procedure (i.e. treating long and short wavelengths separately)
gives a result identical to that of the full calculation, yet
requires significantly less computation.

The algorithm works as follows. Take the linear potential
$\psi(\mathbf{q})$, defined on the vertices $\mathbf{q}$ of a grid. The grid spacing
$\Delta$, together with the extent of the grid, $L$, determines the range
of waves that can be represented, namely between $2\Delta$ and $L$.
Now consider two grids, with spacings $\Delta_1$ and $\Delta_2$,
and extents $L_1$ and $L_2$ respectively. Grid 2 represents a higher
resolution grid contained within grid 1, and we want to add the
long-wavelength perturbations of grid 1 onto grid 2, increasing
the dynamic range from $L_2/\Delta_2$ to $L_1/\Delta_2$.

On the vertices of grid 2, we can add the contributions from
fluctuations on grid 1 and grid 2 to obtain the potential

$$\psi(\mathbf{q}) = \psi_1(\mathbf{q}) + \psi_2(\mathbf{q}). \qquad \mathrm{(A.1)}$$

Clearly $\psi$ has contributions from the full range of waves,
$2\Delta_2$ to $L_1$. Of course the spacing of grid 1 is coarser than that of
grid 2, $\Delta_1 > \Delta_2$, so equation (A.1) involves an interpolation
from the coarser to the finer grid. But the key point is that, as
long as the operations we are going to do on $\psi$ are linear, we
can perform them on grids 1 and 2 independently, and just add 
the result at the end to compute the collapse time for the ver- 
tices of the higher resolution grid. The rest of the PINOCCHIO 
calculation now only applies to the high resolution grid, but we 
have to be aware of boundary effects on the edge of the smaller 
grid. 
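
The role of linearity in this argument can be illustrated with a minimal Python sketch. It is not the PINOCCHIO implementation: the one-dimensional periodic grid, the toy components `psi_long` and `psi_short`, and the second-difference operator are all assumptions chosen for illustration, and the interpolation from the coarse to the fine grid is skipped by defining both components directly on the fine grid.

```python
import math


def second_difference(field):
    """Periodic second difference: a simple linear operator (illustrative stand-in)."""
    n = len(field)
    return [field[(i + 1) % n] - 2.0 * field[i] + field[(i - 1) % n] for i in range(n)]


n = 64  # number of vertices of the fine grid (hypothetical value)
x = [i / n for i in range(n)]

# Hypothetical long- and short-wavelength parts of the potential on the fine grid.
psi_long = [math.sin(2 * math.pi * xi) + 0.5 * math.cos(4 * math.pi * xi) for xi in x]
psi_short = [0.1 * math.sin(2 * math.pi * 16 * xi) for xi in x]
psi_full = [a + b for a, b in zip(psi_long, psi_short)]

# Linear step: operate on each part separately, then add the results.
separate = [a + b for a, b in zip(second_difference(psi_long), second_difference(psi_short))]
together = second_difference(psi_full)
max_linear_diff = max(abs(a - b) for a, b in zip(separate, together))

# A non-linear operation (here squaring) does not split in the same way.
sq_separate = [a * a + b * b for a, b in zip(psi_long, psi_short)]
sq_together = [c * c for c in psi_full]
max_nonlinear_diff = max(abs(a - b) for a, b in zip(sq_separate, sq_together))

print(max_linear_diff, max_nonlinear_diff)
```

The linear operation agrees to rounding error whether it is applied to the parts or to the sum, whereas the squared field does not. This is why only the linear operations can be split between the grids, while the non-linear calculation must use the summed result.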

When initialising the Gaussian fluctuations on these grids,
we use the power spectrum $P(k)\,\Theta(k_1)$ on grid 1, and
$P(k)\,(1 - \Theta(k_1))\,\Theta(k_2)$ on grid 2, where $P(k)$ is the desired linear
power spectrum, and the Heaviside function $\Theta$ cuts off the
contribution from waves with $k > k_1$ and $k > k_2$, respectively. Here $k_2$ denotes the
Nyquist frequency on the high-resolution grid, and $k_1$ should
be smaller than the Nyquist frequency of the lower-resolution
grid but larger than $2\pi/L_2$.
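
A minimal Python sketch of how the two filtered spectra partition the modes, under stated assumptions: the Heaviside factor with cut-off $k_c$ is read as a low-pass step equal to 1 for $k \le k_c$ and 0 otherwise, `power` is a placeholder power law rather than the linear power spectrum actually used, and `k1` is simply the midpoint of the allowed range, not the value adopted in the paper. The box and grid sizes are those quoted for these simulations.

```python
import math


def heaviside_lowpass(k, k_cut):
    """Assumed reading of the Heaviside factor: 1 for k <= k_cut, 0 otherwise."""
    return 1.0 if k <= k_cut else 0.0


def power(k):
    """Hypothetical placeholder spectrum, standing in for the linear P(k)."""
    return k ** -2.0


def power_grid1(k, k1):
    """Spectrum used on the coarse grid: only waves with k <= k1."""
    return power(k) * heaviside_lowpass(k, k1)


def power_grid2(k, k1, k2):
    """Spectrum used on the fine grid: only waves with k1 < k <= k2."""
    return power(k) * (1.0 - heaviside_lowpass(k, k1)) * heaviside_lowpass(k, k2)


# Box lengths (Mpc/h) and number of cells per side for the two grids.
L1, n1 = 520.0, 64
L2, n2 = 65.0, 256

nyquist1 = math.pi * n1 / L1   # Nyquist frequency of the coarse grid
k2 = math.pi * n2 / L2         # Nyquist frequency of the fine grid
k_min = 2.0 * math.pi / L2     # longest wave that fits in the fine grid
k1 = 0.5 * (k_min + nyquist1)  # illustrative choice only, inside the allowed range

samples = [0.02, 0.1, k1, 0.3, 1.0, 5.0, k2, 20.0]
totals = [power_grid1(k, k1) + power_grid2(k, k1, k2) for k in samples]
for k, total in zip(samples, totals):
    print(f"k = {k:8.4f}  P1 + P2 = {total:12.4f}")
```

With this reading of the step function, every wave up to $k_2$ is assigned to exactly one grid, so the two spectra add up to $P(k)$ without double counting.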

For the box and grid lengths given in Section 2.2 ($L_2 = 65\,h^{-1}$ Mpc,
$L_2/\Delta_2 = 256$, $L_1 = 8L_2 = 520\,h^{-1}$ Mpc, $L_1/\Delta_1 =$
64), we found that a good choice for k\ = -k jL\. The effective 
dynamic range of these simulations is thus $(L_1/\Delta_2)^3 = 2048^3$,
whereas the simulation time is closer to that of
two $256^3$ simulations. Given that the simulation time is dominated
by the fast Fourier transforms on the grid, which scale as
$N \log N$ with $N = (L/\Delta)^3$, this is an acceleration by a factor
of 352, and we effectively perform a $2048^3$ simulation in a
few hours on a PC.
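
The quoted speed-up can be checked with a few lines of Python, assuming, as stated in the text, that the cost is dominated by FFTs scaling as N log N and that the two-grid run costs about as much as two 256^3 runs. The function name `fft_cost` is illustrative.

```python
import math


def fft_cost(cells_per_side):
    """Relative FFT cost N log N for a cubic grid with N = cells_per_side**3."""
    n_cells = cells_per_side ** 3
    return n_cells * math.log(n_cells)


# Effective resolution: L1/Delta2 = (L1/L2) * (L2/Delta2) = 8 * 256.
effective_side = 8 * 256

# One 2048^3 run compared with two 256^3 runs.
speedup = fft_cost(effective_side) / (2 * fft_cost(256))

print(effective_side, round(speedup))  # prints: 2048 352
```

The ratio is $512 \times (33/24) / 2 = 352$, where 512 is the ratio of cell numbers and 33/24 the ratio of the logarithms.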

Appendix B: PINOCCHIO accuracy in recovering
peculiar velocity components

To check the accuracy achieved by PINOCCHIO in predicting
the three velocity statistics defined in Section 2.3, we compare
the PINOCCHIO results to those of two different $256^3$ N-body
simulations, the 100 Mpc/h run used by Monaco et al. (2002b)
and Taffoni et al. (2002), and the 250 Mpc/h run presented by
Fontanot et al. (2003). In both cases we run PINOCCHIO with
a single grid, and on the same initial conditions as the
simulations. The assumed cosmologies are similar to the reference
one, except that $h = 0.65$ in the first simulation, and $\sigma_8 = 0.8$
in the second one. For the first simulation, the mass resolution
is a factor of 3 lower, so that $3 \times 10^{11}\,M_\odot$ halos are the smallest
reliable ones. For the second simulation the mass of the
particle is $\sim 10^{11}\,M_\odot$, so that only halos with
$M \gtrsim 5 \times 10^{12}\,M_\odot$ are reliable. In order to have a
sufficient number of halos, we test this simulation at $z = 0$.
Halos in both simulations have been selected with the usual
friends-of-friends algorithm with a linking length 0.2 times the
interparticle distance (Jenkins et al. 2001).

Fig. A.1 shows the streaming, gradient and random velocity
statistics for the PINOCCHIO and N-body runs. On scales
larger than 10 co-moving Mpc at $z = 2$ (or 20 at $z = 0$)
PINOCCHIO systematically underestimates the random velocity
by ~30 km s$^{-1}$, while it reproduces fairly well the streaming
velocity. The gradient component is underestimated at worst
by ~30%. At smaller scales these underestimates are larger,
but the qualitative behaviour is always reproduced.

This level of agreement is expected, because PINOCCHIO
velocities are based on the Zel'dovich (1970) approximation,
which is known to reproduce well the large-scale velocity field
but to underestimate the small-scale, highly non-linear velocities.
The latter are the result of infall of halos onto neighbours.

As the scales of interest are those relative to the large-scale
structures observed (like filaments), roughly corresponding to
the cross-over of streaming and random velocities, we conclude
that PINOCCHIO is sufficiently accurate for our present
purpose.

Fig. A.1. Velocity statistics comparing PINOCCHIO and N-body
runs. Upper panels show the 100 Mpc/h simulation of
Monaco et al. (2002b) for $M > 3 \times 10^{11}\,M_\odot$ at $z = 2$, lower
panels the 250 Mpc/h simulation of Fontanot et al. (2003) for
$M > 3 \times 10^{12}\,M_\odot$ at $z = 0$.



